Sciendo


BULGARIAN ACADEMY OF SCIENCES 


CYBERNETICS AND INFORMATION TECHNOLOGIES • Volume 19, No 3


Sofia • 2019 Print ISSN: 1311-9702; Online ISSN: 1314-4081
DOI: 10.2478/cait-2019-0028 


Uncertainty Aware Resource Provisioning Framework for Cloud 
Using Expected 3-SARSA Learning Agent: NSS and FNSS Based 
Approach 


Bhargavi K.¹, B. Sathish Babu²


¹Department of CSE, Siddaganga Institute of Technology, Tumkur, Karnataka, India
²Department of CSE, R V College of Engineering, Bangalore, Karnataka, India
E-mails: bhargavi.tumkur@gmail.com bsbabu@rvce.edu.in


Abstract: Efficiently provisioning resources in a large computing domain like the
cloud is challenging due to uncertainty in resource demands and in the computation
ability of the cloud resources. Inefficient provisioning leads to several issues, such as
a drop in Quality of Service (QoS), violation of the Service Level Agreement (SLA),
and over-provisioning or under-provisioning of resources. The main objective of the
paper is to formulate optimal resource provisioning policies by efficiently handling
the uncertainties in the jobs and resources with the application of Neutrosophic
Soft-Set (NSS) and Fuzzy Neutrosophic Soft-Set (FNSS). Compared to the existing
fuzzy auto-scaling work, the proposed work achieves a throughput of 80% with a
learning rate of 75% on homogeneous and heterogeneous workloads, considering the
RUBiS, RUBBoS, and Olio benchmark applications.


Keywords: SARSA (State-Action-Reward-State-Action), Resource provisioning,
Uncertainty, Soft-set, Elasticity, Throughput, Learning rate.


1. Introduction 


The cloud resource demands of complex computational applications in areas such as
engineering, economics, and environmental science fluctuate heavily and involve data
that are uncertain and imprecise, so elastic resource provisioning becomes one of the
critical requirements of such applications. The elastic resource provisioning
mechanism allows the user to scale the resources up or down dynamically at run-time;
this feature reduces infrastructure cost and helps the application attain a high
Quality of Service (QoS) by meeting the Service Level Agreements (SLAs). The
existing resource provisioning approaches can be classified into two types, reactive
and proactive. Reactive approaches take resource provisioning decisions when the
load on the system resources is high, whereas proactive approaches estimate the
probable load on the system resources and then lease the resources in advance [1-3].
Elastic resource provisioning in the cloud involves several challenges, such as the
existence of heterogeneous hardware, maintenance of the virtual machine
compatibility table, periodic updating of the states of the virtual machines, frequent
failures of nodes during scaling, long-term irregular workload parameters, sudden
changes in the processing capability of the resources, and frequent violation of the
SLA [4-10]. Methodologies based on thresholds, time series analysis, queuing theory,
and control theory have failed to provide satisfactory solutions to the resource
provisioning problem, as those solutions are affected by the undetermined and erratic
changes in the processing requirements of jobs and by the unstable processing
behavior exhibited by the resources [11-13]. Hence it is necessary to handle the
uncertainty in the job and resource parameters before taking resource provisioning
decisions.

Many mathematical models are available to handle uncertainty, such as probability
theory, interval mathematics, and fuzzy sets, but these techniques have several
limitations. Probability theory is suitable only for stochastically stable phenomena
and usually takes more trials to provide a solution; interval mathematics fails to
handle uneven changes in the workload parameters; and in fuzzy sets, computing the
membership function is tedious, as it is not general but specific to the individual,
and so cannot handle the dynamics of a large state space. These drawbacks motivated
the use of the soft-set, which is a parameterized family of sets and does not put any
restriction on the approximate description, as it sets a soft boundary depending on
the parameters. The conventional reinforcement learning techniques draw policies
under the assumption that the underlying environment is static and do not take the
changing dynamics into account. This assumption fails in a highly dynamic
environment like the cloud, which motivated the use of soft-set enabled
reinforcement learning [14-17].

The objectives of the paper are as follows.

Identify the uncertainty in the jobs and resources by representing their states in
the form of a Partially Observable Markov Decision Process (POMDP) model and a
Hidden Markov Model (HMM).

Handle the uncertainties of the jobs and resources using NSS and FNSS, as they
provide practical frameworks to measure the truth, indeterminacy, and falsehood of
the data associated with the resource provisioning decisions.

Design an expected 3-SARSA (State-Action-Reward-State-Action) learning agent
empowered with the NSS and FNSS models, which controls the exploration during
action selection.

Evaluate the resource provisioning policies with respect to the successful job
completion rate and the learning rate. The SARSA agent updates the resource
provisioning policies by considering three adjacent expected action-value pairs,
which increases the learning stability of the agent and also increases the successful
job completion rate.

The remaining part of the paper is organized as follows: Section 2 deals with
related work; Section 3 describes the system model; Section 4 gives a high-level
view of the proposed work; Section 5 presents the interval-valued NSS analysis of the
proposed work; Section 6 deals with results and discussion; and finally Section 7
draws the conclusion.

2. Related works


In [19], a resource allocation scheme under job uncertainty is proposed. The
execution delay of the incoming jobs is predicted using a self-similar long-tail
process, in which similar properties are repeated at different time scales, and the
Pareto fractal flow prediction model is then used for resource allocation. However,
the allocation rests on the assumption that the jobs exhibit similar properties,
whereas complex computational jobs are highly random in nature and always exhibit
an uneven workload pattern, so the efficiency of the resource allocation is found to
be below average.

In [20], a deep reinforcement learning based resource provisioning scheme is
proposed to minimize the energy consumption of data centers. Deep reinforcement
learning is employed using multiple layers of computational nodes, which try to
learn from the changing cloud environment to draw optimal resource provisioning
policies. The scheme is found to be good with respect to energy reduction in large
data centers, as it effectively handles sudden bursts of workload, but the time taken
by the network to converge is high, as it takes too long to balance exploration and
exploitation.

A reinforcement learning based automatic resource scaling system is proposed in
[21], in which multiple reinforcement learning agents with a parallel learning policy
are used to allocate the resources. Each agent has a different learning experience,
and the agents share the learned information with one another. The parallel learning
process is found to be good with respect to the rate of learning and Q-value table
updating. However, it increases the interaction rate between the agents, as a huge
state space needs to be considered while deciding the actions, which in turn increases
the response time of the agents and leads to improper utilization of the resources.

In [22], a new predictive resource scaling approach for cloud systems is proposed.
The approach extracts fine-grained patterns from the workload demands and then
adjusts the resources accordingly. Signal processing and statistical methods are used
to extract the patterns. The workload patterns are analyzed as they are, i.e., the
uncertainty is not handled, so there is a drop in prediction accuracy, which results
in an increased rejection rate of the jobs.

An analytical model based auto-scaling mechanism is used in [23]. An analytical
model is developed to characterize the workload and to analyze its impact on the
efficiency of the scale-out or scale-in decisions in the cloud. An inference is drawn
that scaling up is suitable when the SLA is strict and scaling down is suitable when
the workload is high. The Kalman filtering based auto-scaling solution is applied to
the scaling of infrastructure services, as their topology is available, but the model
does not fit the scaling of software applications, as they lack a fixed topology.

A comparison of fuzzy SARSA and fuzzy Q-Learning for auto-scaling of resources
in the cloud environment is given in [24]. Both approaches are used to scale the
resources efficiently under a variety of workloads and to maximize the resource
utilization rate. However, the performance of fuzzy Q-Learning is low with respect
to the learning rate, as it always tries to compare the actual state with the best
possible next state while taking actions using the fuzzy rule base, and the
performance of fuzzy SARSA learning is low with respect to adaptability to
heterogeneous workloads, as the policy formed after the learning phase is not
optimized further to adapt to uneven patterns in the workload.

In [25], a self-managed virtual machine scheduling technique for the cloud
environment is proposed. The placement of virtual machines in the cloud is a
computation-intensive activity, so in this approach the history of each virtual
machine's resource (CPU, memory, hard disk, RAM, network, and so on) utilization
ratio is taken into account to predict the resource utilization level, and then the
decisions about virtual machine placement are made. However, the state of the
virtual machines inside the physical machines is not directly visible and consists of
several hidden states. As a result, the accuracy of the resource utilization level
predicted using various machine learning models is low, which results in improper
placement of virtual machines inside the physical machines and leads to a drop in
physical machine throughput.

In [26], a heuristic approach is used to schedule the tasks through proper
distribution of the resources. Every incoming task is processed using a modified
analytic hierarchy process, and then the resources are scheduled using a differential
evolution algorithm. The analytic hierarchy process ranks the tasks based on their
requirements. However, it is not possible to rank the tasks directly in the cloud
environment, as the jobs are usually malleable: they start with very few resource
requirements and then gradually expand to higher resource requirements. As a result,
applying the analytic hierarchy process to malleable jobs leads to improper ranking
of jobs, and the chances of pre-empting the higher priority jobs increase, which
leads to improper utilization of resources.

In [27], machine learning based resource provisioning techniques for the cloud
environment are discussed. Automated, self-learning resource provisioning is
essential to deliver elastic services to customers while satisfying their needs. A
time series forecasting technique is used to predict the number of resources to be
sanctioned for the incoming client requests, and a support vector regression model is
used to forecast the processing capability of the servers. The use of the time series
model in combination with the support vector machine is one of the biggest
limitations of the approach, as it fails to capture the chaotic and non-deterministic
behaviors of the servers and client requests due to the use of a quadratic
programming approach.

In [28], a deep learning based elastic resource provisioning scheme is proposed
for the cloud environment. Three different deep reinforcement learning techniques,
i.e., simple deep Q-Learning, full deep Q-Learning, and double deep Q-Learning, are
proposed to achieve elasticity in resource provisioning, and are trained to converge
to optimal elasticity policies. All three deep reinforcement learning techniques are
capable of learning in a large state space environment and are able to collect a
sufficient amount of rewards. However, the training of the models is computationally
expensive, and the accuracy of the elastic resource provisioning policies formed is
weak, as the models operate directly on the partial information exhibited by the jobs
and resources without using any membership functions to handle uncertainties in the
nested layers of the deep reinforcement techniques.

A survey of prediction model based resource provisioning techniques available
for the cloud environment is given in [29]. Resource provisioning is one of the key
issues in the cloud environment, as the behavior pattern of workloads keeps varying,
which leads to frequent violations of service level agreements. Various prediction
models, such as neural networks, fuzzy logic, linear regression, Bayesian theory,
support vector machines, and reinforcement learning, are used to estimate the future
demands of resources. The pros and cons of each of these models are discussed, and
the reinforcement learning technique enriched with the fuzzy logic model is found to
perform well in terms of the speed and accuracy of the resource mapping, as it is
proactive in nature and precisely mines the correlation among the variety of
resources.

A reinforcement-enabled technique for energy efficient resource provisioning is
discussed in [30] to achieve maximum revenue. Based on the user requirements that
are read, the virtual machines and physical machines are hosted in the cloud. The
resource allocation policy is updated based on the reward collected for the energy
utilization factor of every virtual machine and physical machine in the data center.
However, while predicting the future resource demands, resource attributes such as
CPU utilization, amount of memory, system availability, and system performance are
assumed to be static and transparent in nature, which becomes the major limiting
factor of the technique.

An approach to load balancing among the virtual machines in the cloud data
center using the Pareto principle is presented in [31]. As the computation
requirement of the applications keeps varying, the virtual machines need to be scaled
up and down, so a Pareto based genetic algorithm is used to generate a large number
of solutions and then select one of them as the best. The workload requirement of
the user is taken directly for analysis without any pre-processing; hence uncertainty
influences the load balancing solutions formed. Moreover, the stringent nature of the
genetic algorithm increases the time taken to converge towards an optimal solution,
and the algorithm may even fail to arrive at the global optimum.

In [32], fuzzy logic based hybrid bio-inspired techniques, Ant-colony and
Firefly, are developed for the placement of virtual machines within the data center
and for server consolidation. The basic principle used for server consolidation is to
pack as many virtual machines as possible within the data center. This works well on
steady-state workloads, but during heavy bursts of workload it leads to
over-utilization of the resources. The uncertainty factor is handled through the use
of fuzzy membership functions inside the ant-colony and firefly algorithms, but
achieving more accuracy requires an exponential increase in the number of fuzzy
rules. These algorithms are also relatively old, and their performance is weak
compared to recent bio-inspired techniques such as whale, crow, squirrel, and raven
roosting.

Dynamic resource allocation for the cloud environment that considers demand
uncertainty is discussed in [33]. The cloud providers allocate resources on a
reservation basis or on an on-demand basis. Reservation-based allocation of
resources is carried out over a long-term duration, which involves lower uncertainty,
whereas on-demand allocation of resources is carried out over a short-term or
long-term duration, which involves higher uncertainty. In this work, the uncertainty
in user demands is modelled as random variables using a stochastic optimization
approach, and an algorithm with two phases is developed: the first phase reserves
the resources and the second phase allocates the resources dynamically. However,
modelling the demand uncertainty using random variables is very difficult, as it does
not possess exact stopping criteria, and the uncertainty involved in the processing
capability of the resources is also ignored, which leads to under-utilization or
over-utilization of the resources.

To summarize, most of the existing works exhibit the following drawbacks.

• They are unable to determine the uncertainty involved in the job processing
requirement.

• They are unable to determine the uncertainty involved in the resource
computation ability.

• The prediction accuracy drops due to the failure to determine the exact pattern
of processing requirements in malleable jobs.

• Because the hidden states and partially observable states are ignored while
making resource provisioning decisions, the chances of over-provisioning or
under-provisioning of the resources are high.

• The conventional reinforcement learning algorithms fail to form robust
resource provisioning policies, as they cannot capture the chaotic behaviours of the
servers and clients using a deterministic approach.

• Lack of proactiveness while taking resource provisioning decisions leads to
a decrease in the accuracy and speed of learning.

• The bio-inspired algorithms fail to arrive at a global optimum solution, and
even the convergence rate is high due to the harsh approach involved in workload
analysis.


3. System model 


This section provides the mathematical modeling of the system under consideration.
A cloud is assumed to be a collection of resource pools,

(1) $C = \{R_i\}$.
Every $R_i$ consists of an unlimited set of heterogeneous resources,
(2) $R_i = \{r_i\}_{i \ge 0}$.
The capacity of every resource is drawn from the resource pool $R^{+}$,
(3) $C(r_i) \in R^{+}$.
The price associated with every resource is drawn from the price pool $P^{+}$,
(4) $P(r_i) \in P^{+}$.

The resources $r_i$ and $r_j$ of the cloud are connected through a network with link
$L_{r_i, r_j}$, and the rental time of each resource to process the incoming jobs is
limited.

The jobs are classified into various categories according to their resource
requirements, i.e., low ($l$), medium ($m$), and high ($h$),
(5) $J = \{J_1 = \{R_1(l), R_1(m), R_1(h)\}, \ldots, J_k = \{R_k(l), R_k(m), R_k(h)\}\}$.

The jobs and resources are associated with uncertainties in terms of their
resource requirements and processing capabilities, which vary dynamically within
the given time frame $T(J_i)$,

(6) $T(J_i) = (T(J_i^{1}), \ldots, T(J_i^{k}))$ and $T(R_i) = (T(R_i^{1}), \ldots, T(R_i^{k}))$.
The uncertainties of the resources are handled using Neutrosophic Soft-Set
(NSS) theory and the uncertainties of the jobs are handled using Fuzzy Neutrosophic
Soft-Set (FNSS) theory.

Let NSS be the neutrosophic soft-set on the universe of discourse $U$ and let $E$ be
the set of parameters:

(7) $NSS = \{(u, T_{NSS}(u), I_{NSS}(u), F_{NSS}(u)) : u \in U\}$,

where $T$, $I$, and $F$ are the truth, indeterminacy, and falsity values,
$T, I, F \to \,]^{-}0, 1^{+}[$, $^{-}0 \le T_{NSS}(u) + I_{NSS}(u) + F_{NSS}(u) \le 3^{+}$,
and $E = \{E_1, E_2, \ldots, E_n\}$. The collection $(F, NSS)$ is then referred to as
the neutrosophic soft-set of resources over $U$.

Let FNSS be the fuzzy neutrosophic soft-set on the universe of discourse $U$ and let
$E$ be the set of parameters:

(8) $FNSS = \{(u, T_{FNSS}(u), I_{FNSS}(u), F_{FNSS}(u)) : u \in U\}$,

where $T, I, F \to [0, 1]$ and $0 \le T_{FNSS}(u) + I_{FNSS}(u) + F_{FNSS}(u) \le 3$. The
collection $(F, FNSS)$ is then referred to as the fuzzy neutrosophic soft-set of jobs
over $U$.
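
The constraints of Eq. (8) can be illustrated with a minimal Python sketch. The class name `NeutrosophicValue`, the helper `build_soft_set`, and the sample parameter and degrees are hypothetical and serve only as an illustration; the sketch covers the standard interval $[0, 1]$ used by FNSS, not the non-standard interval of Eq. (7).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeutrosophicValue:
    """Truth, indeterminacy, and falsity degrees of one element u."""
    truth: float
    indeterminacy: float
    falsity: float

    def __post_init__(self):
        # Each degree must lie in [0, 1]; the sum then lies in [0, 3].
        for name in ("truth", "indeterminacy", "falsity"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")

    def total(self) -> float:
        return self.truth + self.indeterminacy + self.falsity


def build_soft_set(table):
    """Map each parameter e to {element u: NeutrosophicValue}."""
    return {
        param: {u: NeutrosophicValue(*tif) for u, tif in elems.items()}
        for param, elems in table.items()
    }


# Hypothetical parameter "cpu_demand" evaluated for two jobs.
jobs = build_soft_set({
    "cpu_demand": {"J1": (0.5, 0.25, 0.25), "J2": (0.9, 0.1, 0.3)},
})
```

Because the three degrees are independent, their sum is not forced to equal 1, which is what distinguishes this representation from a probability distribution.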

Later, resource provisioning decisions for the jobs are taken using the expected
3-SARSA model,

(9) $(F, NSS), (F, FNSS) \rightarrow \{Q^{A}(S_t, A_t), Q^{B}(S_t, A_t), Q^{C}(S_t, A_t)\}$,

where
$Q^{A}(S_t, A_t)^{*} = \alpha[r_t + [r_{t-1} * Q^{B}(S_{t-1}, A_{t-1}) - r_{t-2} * Q^{C}(S_{t-2}, A_{t-2})]]$,
in which $r_t$, $r_{t-1}$, and $r_{t-2}$ are the rewards obtained at states $S_t$,
$S_{t-1}$, and $S_{t-2}$, and $A_t$, $A_{t-1}$, and $A_{t-2}$ are the actions taken at
the states $S_t$, $S_{t-1}$, and $S_{t-2}$.


3.1. Resources model 


The state of the resources in the cloud depends on the hidden state of the virtual
machines mounted on top of every physical machine in the cloud resource pool.
Hence the resources are modeled using HMM,

(10) $HMM(R_i) = (S(R_i), V(R_i), B(R_i), A(R_i), I(R_i))$,

where: $S(R_i) = \{S_1(R_i), S_2(R_i), \ldots, S_n(R_i)\}$ represents the states of the
resource;

$V(R_i) = \{V_1(R_i), V_2(R_i), \ldots, V_n(R_i)\}$ is the set of value symbols
associated with the resource;

$B(R_i) = \{b(V_i(R_i))\}$ indicates the output state probability, where
$\sum_{i=1}^{n} b(V_i(R_i)) = 1$;

$A(R_i) = \{a_{ij}\}$ is the probability of transition from state $i$ to state $j$, where
$\sum_{j} a_{ij} = 1$, $i \ge 1$, and $j \le n$;

$I(R_i) = \{I_1(R_i), I_2(R_i), \ldots, I_n(R_i)\}$ is the set of initial state
probabilities of the resource, where $\sum_{i=1}^{n} I_i(R_i) = 1$.
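
The normalization constraints listed for the HMM tuple can be checked mechanically. The following Python sketch is only an illustration: the function names `is_distribution` and `check_hmm` and the numeric values are hypothetical, and the output probabilities are treated as a single distribution over value symbols, as the definition of $B(R_i)$ above suggests.

```python
import math


def is_distribution(probs, tol=1e-9):
    """True when probs are non-negative and sum to 1 (within tol)."""
    return all(p >= 0.0 for p in probs) and math.isclose(
        sum(probs), 1.0, abs_tol=tol
    )


def check_hmm(initial, transition, output):
    """Return the list of violated constraints (empty if the model is valid)."""
    problems = []
    n = len(initial)
    if not is_distribution(initial):
        problems.append("initial probabilities must sum to 1")
    if len(transition) != n or any(len(row) != n for row in transition):
        problems.append("transition matrix must be n x n")
    else:
        for i, row in enumerate(transition):
            if not is_distribution(row):
                problems.append(f"transition row {i} must sum to 1")
    if not is_distribution(output):
        problems.append("output probabilities must sum to 1")
    return problems


# Hypothetical two-state resource model (e.g. lightly and heavily loaded VM).
valid = check_hmm(
    initial=[0.5, 0.5],
    transition=[[0.9, 0.1], [0.2, 0.8]],
    output=[0.3, 0.7],
)
invalid = check_hmm(
    initial=[0.5, 0.5],
    transition=[[0.9, 0.1], [0.5, 0.6]],
    output=[0.3, 0.7],
)
```

Returning a list of violations rather than raising on the first one makes it easier to report every inconsistent row of a resource model at once.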

A sample HMM model of resources is shown in Fig. 1. The hidden states of the
resources are handled using NSS, which characterizes every hidden state of the
resources while processing the job requests. By perceiving all information about the
unobserved hidden states through the neutrosophic truth, indeterminacy, and falsity
membership functions, it switches optimally between exploration and exploitation of
the resources while processing job requirements.

Fig. 1. A sample HMM model of resources. Legend: HS($R_i$), hidden state of resource;
VS($R_i$), visible state of resource; VM$_i$, virtual machine; PM$_i$, physical machine


3.2. Jobs model 


Based on the elasticity of resource usage, the jobs are broadly classified into two
types, evolving and malleable. Evolving jobs operate on an inflatable range of
computing nodes, whereas malleable jobs start with the few nodes available and
gradually expand to a larger number of nodes. Only partial information is available
about the resource requirement of the jobs. Hence the jobs are modelled using
POMDP,
(11) $POMDP(J_i) = (S(J_i), A(J_i), P(J_i), R(J_i), \Omega(J_i), O(J_i))$,
where:

$S(J_i) = \{S_1(J_i), S_2(J_i), \ldots, S_n(J_i)\}$ represents the states of the job;

$A(J_i) = \{A_1(J_i), A_2(J_i), \ldots, A_n(J_i)\}$ represents the finite set of actions;

$P(J_i) = \{a_{ij}\}$ is the probability of transition from state $i$ to state $j$, where
$\sum_{j} a_{ij} = 1$, $i \ge 1$, and $j \le n$;

$R(J_i) = R(S_i(J_i), A_i(J_i))$ is the reward model of the jobs;

$\Omega(J_i) = \{\Omega_1(J_i), \Omega_2(J_i), \ldots, \Omega_n(J_i)\}$ is the finite set of
observations;

$O(J_i)$ is the observation model of the jobs.
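
Since only partial information about a job is available, a POMDP agent typically maintains a belief (a probability distribution over the hidden job states) and revises it after each action and observation. The sketch below shows the textbook belief update, not a procedure stated in this paper; the function name `belief_update`, the array layout, and all numbers are assumptions made for illustration.

```python
def belief_update(belief, action, observation, transition, observation_model):
    """Textbook POMDP belief update.

    transition[a][s][s2]        = P(s2 | s, a)
    observation_model[a][s2][o] = P(o | s2, a)
    """
    n = len(belief)
    unnormalized = []
    for s2 in range(n):
        # Predict the next state, then weight by the observation likelihood.
        predicted = sum(belief[s] * transition[action][s][s2] for s in range(n))
        unnormalized.append(observation_model[action][s2][observation] * predicted)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [x / total for x in unnormalized]


# Hypothetical malleable job: state 0 = few nodes needed, state 1 = many nodes.
transition = [[[0.7, 0.3], [0.0, 1.0]]]          # single action 0 ("keep")
observation_model = [[[0.9, 0.1], [0.2, 0.8]]]   # observation 1 = high demand seen
new_belief = belief_update([1.0, 0.0], 0, 1, transition, observation_model)
```

In this example the job is initially believed to need few nodes, and a single high-demand observation shifts most of the belief mass to the many-nodes state.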


Fig. 2. A sample POMDP model of jobs. Legend: OS($J_i$), origin state of job;
HS($J_i$), hidden state of job; VS($J_i$), visible state of job; QoS-$i$, Quality of
Service type; U$_i$, user


A sample POMDP model of jobs is shown in Fig. 2. The partial states of the jobs
are handled using FNSS, which characterizes every partial state of the jobs by giving
equal importance to the varying resource requirements of the jobs, and which
resolves otherwise intractable uncertainties in the partial states of the jobs using
the fuzzy neutrosophic truth, indeterminacy, and falsity membership functions.


4. Proposed work 


The proposed NSS and FNSS model based expected 3-SARSA learning resource
provisioning framework is shown in Fig. 3.


Fig. 4. Flowchart of the proposed framework. Panels: J-FNSSUR, R-NSSUR, and
E(3-SARSA)-RSA; final step: output resource provisioning decisions


The framework is composed of three modules: the Resource Neutrosophic Soft-Set
Uncertainty Reducer (R-NSSUR), the Job Fuzzy Neutrosophic Soft-Set Uncertainty
Reducer (J-FNSSUR), and the Expected 3-SARSA learning agent. The R-NSSUR module
handles resource uncertainty, the J-FNSSUR module handles job related uncertainty,
and the Expected 3-SARSA resource provisioning agent (E(3-SARSA)-RSA) forms
optimal policies for resource provisioning by triple coupling the action-value pairs
of the agent. Fig. 4 gives the flowchart representation of the proposed framework.


4.1. Resource Neutrosophic Soft-Set Uncertainty Reducer (R-NSSUR) 


The R-NSSUR takes the HMM model of the resource, $HMM(R_i)$, as input and
constructs the NSS of the HMM model of the resource, NSS-HMM($R_i$), by calculating
the truth membership function $T_{NSS}(R_i)$, the indeterminacy function $I_{NSS}(R_i)$,
and the falsity membership function $F_{NSS}(R_i)$ of the resources. The discernibility
matrix $D(NSS\text{-}HMM(R_i, R_j))$ is constructed to obtain the minimal reduct of
resources by removing the irrelevant parameters from the hidden state of the
resources; the entries in the matrix are then normalized using the standard minimum
of the discernibility matrix. Finally, the reduced soft-set of the HMM model of the
resource, $HMM(R_i)^{rd}$, is produced as output. The working of R-NSSUR is shown in
Algorithm 1 and satisfies Theorems 4.1 and 4.2.

Algorithm 1. Working of R-NSSUR

Step 1. Begin

Step 2. Input: $HMM(R_i) = (S(R_i), V(R_i), B(R_i), A(R_i), I(R_i))$

Step 3. Output: $HMM(R_i)^{rd} = (S(R_i^{rd}), V(R_i^{rd}), B(R_i^{rd}), A(R_i^{rd}), I(R_i^{rd}))$

Step 4. for every $R_i \in R$ do

Step 5. Form NSS of $HMM(R_i)$, i.e.,

Step 6. $NSS\text{-}HMM(R_i) = (R_i, T_{NSS}(R_i), I_{NSS}(R_i), F_{NSS}(R_i))$

Step 7. Calculate the discernibility matrix $D(NSS\text{-}HMM(R_i, R_j))$, i.e.,

$D(NSS\text{-}HMM(R_i, R_j)) = \{a \in A \mid g(R_i, a) \ne g(R_j, a)\}$, for example
    $\emptyset$    $\{a_1\}$    $\{a_1, a_4\}$
    $\{a_1, a_3, a_4\}$    $\emptyset$

Step 8. Calculate the standard minimum $\wedge^{*}D(NSS\text{-}HMM(R_i, R_j))$, i.e.,
Step 9. $\wedge^{*}D(NSS\text{-}HMM(R_i, R_j)) = (a_i \wedge a_j) \vee (a_k \wedge a_l)$, for example
    $\{a_1^{*}, a_4^{*}\}$    $\{a_3^{*}\}$    $\emptyset$
    $\{a_2^{*}\}$    …    $\{a_2^{*}, a_4^{*}\}$
Step 10. End for

Step 11. Output all reduced resources in HMM form $HMM(R_i)^{rd}$
Step 12. End
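
Step 7 of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration of the set-valued discernibility matrix only: the function names, the resource names, and the $(T, I, F)$ triples are hypothetical, and the standard-minimum normalization of Steps 8 and 9 is not implemented.

```python
def discernibility_matrix(table, parameters):
    """table[obj][param] -> value.

    Returns {(obj_i, obj_j): set of parameters on which the two objects differ}
    for every unordered pair of objects.
    """
    objects = list(table)
    matrix = {}
    for i, first in enumerate(objects):
        for second in objects[i + 1:]:
            matrix[(first, second)] = {
                p for p in parameters if table[first][p] != table[second][p]
            }
    return matrix


def dispensable_parameters(table, parameters):
    """Parameters that never distinguish any pair of objects."""
    used = set()
    for entry in discernibility_matrix(table, parameters).values():
        used |= entry
    return set(parameters) - used


# Hypothetical (T, I, F) triples of three resources over three parameters.
params = ["a1", "a2", "a3"]
resources = {
    "R1": {"a1": (0.8, 0.1, 0.1), "a2": (0.5, 0.2, 0.3), "a3": (0.6, 0.3, 0.1)},
    "R2": {"a1": (0.4, 0.3, 0.3), "a2": (0.5, 0.2, 0.3), "a3": (0.6, 0.3, 0.1)},
    "R3": {"a1": (0.4, 0.3, 0.3), "a2": (0.5, 0.2, 0.3), "a3": (0.2, 0.5, 0.3)},
}
matrix = discernibility_matrix(resources, params)
redundant = dispensable_parameters(resources, params)
```

In this example the parameter `a2` takes the same value for every resource, so it appears in no matrix entry and is a candidate for removal when forming the reduct.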


Theorem 4.1. Let $HMM(R_i)$ be the input model of the resource parameters and
let $HMM(R_i)^{rd}$ be the reduced model of the resource parameters, which is obtained
by eliminating the irrelevant parameters $\gamma$ from $HMM(R_i)$. Then $\gamma$ is
dispensable in $HMM(R_i)$ if and only if $HMM(R_i) - \gamma \Rightarrow HMM(R_i)^{rd}$.

Theorem 4.2. Let $(NSS, E)$ be the input neutrosophic soft-set, where
$E = \{e_1, e_2, \ldots, e_i\}$ is the parameter set representing NSS. If there exist
irrelevant parameters such as $e_i$ and $e_j$, they are added to the reduced parameter
set $E' = \{e_i^{0}, e_j^{0}\}$ to find the subset $A$, i.e., $A \subseteq E'$; then
$E - A - E'$ produces the reduced $(NSS, E')$ excluding $e_i^{0}$ and $e_j^{0}$.


4.2. Job Fuzzy Neutrosophic Soft-Set Uncertainty Reducer (J-FNSSUR)


The J-FNSSUR takes the POMDP model of the job, $POMDP(J_i) =
(S(J_i), A(J_i), P(J_i), R(J_i), \Omega(J_i), O(J_i))$, as input and constructs the FNSS
of the POMDP model of the job, FNSS-POMDP($J_i$), by calculating the truth fuzzy
membership function $T_{FNSS}(J_i)$, the indeterminacy fuzzy function $I_{FNSS}(J_i)$,
and the falsity fuzzy membership function $F_{FNSS}(J_i)$ of the jobs. The
discernibility matrix $D(FNSS\text{-}POMDP(J_i, J_j))$ is constructed to obtain the
minimal reduct of jobs by removing the irrelevant parameters from the hidden state
of the jobs. The entries in the discernibility matrix are then normalized in three
levels: in the first level, the fuzzy weighted average of the discernibility matrix is
calculated; in the second level, the standard minimum of the discernibility matrix is
calculated; and in the third level, the fuzzy weighted average square over the
standard minimum of the discernibility matrix is formulated. Finally, the reduced
soft-set of the POMDP model of jobs, $POMDP(J_i)^{rd}$, is generated as output. The
working of J-FNSSUR is shown in Algorithm 2 and satisfies Theorems 4.3 and 4.4.


Algorithm 2. Working of J-FNSSUR

Step 1. Begin
Step 2. Input: $POMDP(J_i) = (S(J_i), A(J_i), P(J_i), R(J_i), \Omega(J_i), O(J_i))$
Step 3. Output: $POMDP(J_i)^{rd} = (S(J_i)^{rd}, A(J_i)^{rd}, P(J_i)^{rd}, R(J_i)^{rd}, \Omega(J_i)^{rd}, O(J_i)^{rd})$
Step 4. for every $J_i \in J$ do
Step 5. Form FNSS of $POMDP(J_i)$
Step 6. $FNSS\text{-}POMDP(J_i) = (J_i, T_{FNSS}(J_i), I_{FNSS}(J_i), F_{FNSS}(J_i))$
Step 7. Calculate the discernibility matrix $D(FNSS\text{-}POMDP(J_i, J_j))$, i.e.,
Step 8. $D(FNSS\text{-}POMDP(J_i, J_j)) = \{\mu(a) \in A \mid g(J_i, \mu(a)) \ne g(J_j, \mu(a))\}$
Step 9. Compute the fuzzy weighted average of the discernibility matrix, i.e.,
$FD = \sum_{i=1}^{n} (w_i * \mu(a_i))$, where $w_i$ is the weight assigned
Step 10. $D(FNSS\text{-}POMDP(J_i, J_j)) =$
    $\emptyset$    $\{a_1/FD\}$    $\{a_1/FD, a_4/FD\}$
    $\{a_1/FD, a_3/FD, a_4/FD\}$    …    $\emptyset$
Step 11. Calculate the standard minimum $\wedge^{*}D(FNSS\text{-}POMDP(J_i, J_j))$, i.e.,
$\wedge^{*}D(FNSS\text{-}POMDP(J_i, J_j)) = (\mu(a_i) \wedge \mu(a_j)) \vee (\mu(a_k) \wedge \mu(a_l))$
    $\{a_1^{*}, a_4^{*}\}$    $\{a_1^{*}\}$    $\{a_1^{*}, a_4^{*}\}$
    $\{a_3^{*}\}$    $\{a_3^{*}, a_4^{*}\}$
Step 12. Calculate the fuzzy weighted average square over the standard minimum
of the discernibility matrix, i.e., $FD^{2} = \sum_{i=1}^{n} (w_i * \mu(a_i))^{2}$
Step 13. $D(FNSS\text{-}POMDP(J_i, J_j)) =$
    $\{a_1^{*}/FD^{2}, a_4^{*}/FD^{2}\}$    $\{a_3^{*}/FD^{2}\}$    $\{\emptyset^{*}/FD^{2}\}$
    $\{a_2^{*}/FD^{2}\}$    …    $\{a_4^{*}/FD^{2}, a_2^{*}/FD^{2}\}$
Step 14. End for
Step 15. Output all reduced jobs in POMDP form $POMDP(J_i)^{rd}$
Step 16. End
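
The two scalar quantities used for normalization in Steps 9 and 12 of Algorithm 2 can be sketched in Python as follows. The function names, weights, and membership values are hypothetical; the formulas follow the weighted sums $FD$ and $FD^{2}$ as they read in the algorithm, and the helper `scale_entry` only illustrates dividing the memberships of one matrix entry by such a quantity.

```python
def fuzzy_weighted_average(weights, memberships):
    """FD = sum_i w_i * mu(a_i)."""
    if len(weights) != len(memberships):
        raise ValueError("weights and memberships must have the same length")
    return sum(w * m for w, m in zip(weights, memberships))


def fuzzy_weighted_square(weights, memberships):
    """FD^2 = sum_i (w_i * mu(a_i)) ** 2."""
    if len(weights) != len(memberships):
        raise ValueError("weights and memberships must have the same length")
    return sum((w * m) ** 2 for w, m in zip(weights, memberships))


def scale_entry(entry, divisor):
    """Divide every membership value of one matrix entry by divisor."""
    if divisor == 0.0:
        raise ValueError("divisor must be non-zero")
    return {param: value / divisor for param, value in entry.items()}


# Hypothetical weights and membership values of three job parameters.
weights = [0.5, 0.25, 0.25]
memberships = [0.8, 0.4, 0.4]
fd = fuzzy_weighted_average(weights, memberships)
fd_sq = fuzzy_weighted_square(weights, memberships)
scaled = scale_entry({"a1": 0.8, "a4": 0.4}, fd)
```

Note that `fd_sq` is the sum of squared weighted memberships, which in general differs from `fd ** 2`.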
Theorem 4.3. Let $(FNSS, \mu(E))$ be the input fuzzy neutrosophic soft-set, where
$\mu(E) = \{\mu(e_1), \mu(e_2), \ldots, \mu(e_i)\}$ is the set of membership values of the
parameter set representing FNSS. If there exist membership values of irrelevant
parameters such as $\mu(e_i)$ and $\mu(e_j)$ that exhibit a probability greater than 0.5,
they are added to the reduced parameter set $\mu(E') = \{\mu(e_i^{0}), \mu(e_j^{0})\}$ to
find the subset $A$, i.e., $A \subseteq \mu(E)$; then $\mu(E) - A - \mu(E')$ produces the
reduced FNSS excluding $\mu(e_i^{0})$ and $\mu(e_j^{0})$, i.e., $(FNSS, \mu(E'))$.
Theorem 4.4. Let $f$ be a mapping $f: (U_i, E) \to (U_k, E')$; then for
any FNSS, i.e., $FNSS(J_i, E)$ in $f$, the following conditions hold:
$f(\emptyset) = \emptyset$,
$f(J_i, E) \subseteq f(J_k, E)$,
$f((J_i, E) \cup (J_k, E)) = f(J_i, E) \cup f(J_k, E)$,
$f((J_i, E) \cap (J_k, E)) \subseteq f(J_i, E) \cap f(J_k, E)$.


4.3. Expected 3-SARSA resource provisioning agent (E(3-SARSA)-RSA) 


The E(3-SARSA)-RSA model takes three states as input, of which one is the current
state $Q^{A}(S, A)$ and the other two are the expected states $Q^{B}(S, A)$ and
$Q^{C}(S, A)$; considering three states increases the probability of selecting the
action with the highest action value. All three states are initialized to the null
value, and an action is chosen in the beginning using an arbitrarily generated policy.
The value function is computed for the expected next two states, $V^{B}$ and $V^{C}$,
and the agent uses these to update the states and actions, which allows the agent to
converge at different values and to move towards the goal while maintaining a safe
distance from the cliff. Later, a rotate operation is performed on all three Q states
with probability $P$ to derive the optimal policy for resource provisioning; this
increases the speed of learning and leads to convergence to the optimal solution.
The working of E(3-SARSA)-RSA is given in Algorithm 3 and satisfies Theorems 4.5,
4.6, and 4.7.

Algorithm 3. Working of E(3-SARSA)-RSA

Step 1. Begin
Step 2. Input: $Q(S, A) = \{Q^A(S, A), Q^B(S, A), Q^C(S, A)\}$
Step 3. Output: Optimal $Q(S, A)^*$ for every $(S, A)$ pair
Step 4. for every $(S, A)$ pair do
Step 5. Initialize $Q(S, A) = \{\emptyset\}$
Step 6. Choose $A$ from $S$ using the arbitrary policy derived from $Q(S, A)$
Step 7. for every episode do
Step 8. Take action $A$ and observe the reward and next state $(r, S')$
Step 9. Choose $A'$ in $S'$ using the policy $\pi$ derived from $Q^A(S, A)$, $Q^B(S, A)$, and $Q^C(S, A)$
Step 10. Compute value function $V^A = \sum_{A'} \pi(A' \mid S')\, Q^A(S', A')$
Step 11. Compute value function $V^B = \sum_{B'} \pi(B' \mid S')\, Q^B(S', B')$
Step 12. Compute value function $V^C = \sum_{C'} \pi(C' \mid S')\, Q^C(S', C')$
Step 13. Compute $Q^A(S, A) = Q^A(S, A) + \alpha[r + \gamma V^B + \gamma V^C - Q^A(S, A)]$
Step 14. Compute $Q^B(S, A) = Q^B(S, A) + \alpha[r + \gamma V^C + \gamma V^A - Q^B(S, A)]$
Step 15. Compute $Q^C(S, A) = Q^C(S, A) + \alpha[r + \gamma V^A + \gamma V^B - Q^C(S, A)]$
Step 16. Update $S \leftarrow S'$ and $A \leftarrow A'$
Step 17. Rotate $Q^A(S, A)$, $Q^B(S, A)$, and $Q^C(S, A)$ with probability $P$

Step 18. Apply the rotation to the $3 \times 3$ arrangement of $Q^A$, $Q^B$, and $Q^C$
Step 19. End for
Step 20. Compute resource provisioning decisions $AD^{A'}$, $AD^{B'}$, and $AD^{C'}$, i.e.,
$AD^{A'} \leftarrow \sum_i E_\pi[Q^{A'}]$, $AD^{B'} \leftarrow \sum_i E_\pi[Q^{B'}]$, and $AD^{C'} \leftarrow \sum_i E_\pi[Q^{C'}]$
Step 21. End for
Step 22. Output resource provisioning decisions $AD = \{AD^{A'}, AD^{B'}, AD^{C'}\}$
Step 23. End
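
The update and rotation steps of Algorithm 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the first term of each target is read as the observed reward r, each Q table bootstraps from the expected values of the other two tables, the policy is epsilon-greedy over the sum of the three tables, the rotation is a cyclic shift, and the action set `ACTIONS` and all class and parameter names are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical action set, e.g. scale down / hold / scale up (assumption).
ACTIONS = [0, 1, 2]


def epsilon_greedy_probs(q_sum, state, epsilon):
    """Return pi(a | state) for an epsilon-greedy policy over summed Q values."""
    values = [q_sum(state, a) for a in ACTIONS]
    best = values.index(max(values))
    probs = [epsilon / len(ACTIONS)] * len(ACTIONS)
    probs[best] += 1.0 - epsilon
    return probs


class ExpectedTripleSarsa:
    """Sketch of an expected SARSA agent with three Q tables (Q^A, Q^B, Q^C)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, rotate_p=0.5, seed=0):
        self.alpha = alpha
        self.gamma = gamma
        self.epsilon = epsilon
        self.rotate_p = rotate_p
        self.rng = random.Random(seed)
        # All three tables start empty, i.e. every value defaults to 0.0.
        self.q = [defaultdict(float) for _ in range(3)]

    def _q_sum(self, state, action):
        return sum(table[(state, action)] for table in self.q)

    def policy(self, state):
        return epsilon_greedy_probs(self._q_sum, state, self.epsilon)

    def choose(self, state):
        return self.rng.choices(ACTIONS, weights=self.policy(state))[0]

    def expected_value(self, table, next_state):
        """V = sum over a' of pi(a' | s') * Q(s', a') for one table."""
        probs = self.policy(next_state)
        return sum(p * table[(next_state, a)] for p, a in zip(probs, ACTIONS))

    def update(self, s, a, r, s_next):
        # Compute V^A, V^B, V^C before any table is modified.
        v = [self.expected_value(table, s_next) for table in self.q]
        for k, table in enumerate(self.q):
            # Each table bootstraps from the other two (assumed reading).
            others = v[(k + 1) % 3] + v[(k + 2) % 3]
            target = r + self.gamma * others
            table[(s, a)] += self.alpha * (target - table[(s, a)])
        # Rotate the three tables with probability P (cyclic shift assumed).
        if self.rng.random() < self.rotate_p:
            self.q = self.q[1:] + self.q[:1]


agent = ExpectedTripleSarsa(alpha=0.5, gamma=0.9, epsilon=0.1, rotate_p=0.0)
agent.update("s0", 1, 10.0, "s1")
```

Because the target adds two discounted value terms, the effective bootstrap weight is twice the discount factor, so a practical implementation would likely need to choose the discount factor with that in mind.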

Theorem 4.5. For any $HMM(R_i)^{rd}$ of resources and $POMDP(J_i)^{rd}$ of jobs, the
computed $Q(S, A)$ of the E(3-SARSA)-RSA agent is always greater than the computed value
function of the agent at state $S'$.

Theorem 4.6. If $Q(S, A)$ is the Q state of single SARSA, $Q'(S, A)$ is the Q state
of double SARSA, and $Q''(S, A)$ is the Q state of triple SARSA, then the learning rate
$\alpha$ of $Q''(S, A) = \max(Q(S, A), Q'(S, A))$.

Theorem 4.7. The update rule of SARSA does not converge unless the learning
rate drops to zero and the exploration rate tends to zero, i.e.,
$Q(S, A) = Q(S, A) + \alpha[r + \gamma V' - Q(S, A)]$. Expected three SARSA, in contrast,
does not wait until the next state action is performed; it converges as soon as the expected
value of the next state and action is obtained:
$Q^A(S, A) = Q^A(S, A) + \alpha[r + \gamma V^B + \gamma V^C - Q^A(S, A)]$.


5. Interval-valued analysis 


The efficiency of the proposed work is analyzed using the interval-valued NSS analysis
method [18]. Let $AD = \{AD^{A'}, AD^{B'}, AD^{C'}\}$ be the provisioning decisions
generated by a real SARSA learning agent, and let $E$ be the set of parameters describing
the quality of $AD_i \in AD$, with $E = \{e_1 = \text{low}, e_2 = \text{medium}, e_3 = \text{high}\}$.
The analysis is carried out in the following steps.

• Input the job and resource parameters.

• Construct $n$ interval-valued NSS, i.e., $INSS_i$, each consisting of three
components: the NSS truth membership function $T_k$, the NSS indeterminacy function
$I_k$, and the NSS falsity membership function $F_k$, which are populated as follows:

$\begin{matrix} INSS_{11} = [T_k, I_k, F_k] & \cdots & INSS_{1n} = [T_k, I_k, F_k] \\ \vdots & & \vdots \\ INSS_{n1} = [T_k, I_k, F_k] & \cdots & INSS_{nn} = [T_k, I_k, F_k] \end{matrix}$

• Input the threshold of $INSS_i(\alpha, \beta, \gamma)$ using average decision rules.
• Compute the average of $INSS_i$, i.e., $INSS_i^{avg}(\alpha, \beta, \gamma)$, from the component triples $[T_k, I_k, F_k]$.
• Output the averages of $INSS_i(\alpha, \beta, \gamma)$, i.e., $INSS_i^{avg}$:

$\begin{matrix} INSS_{11}^{avg}(\alpha, \beta, \gamma) & \cdots & INSS_{1n}^{avg}(\alpha, \beta, \gamma) \\ \vdots & & \vdots \\ INSS_{n1}^{avg}(\alpha, \beta, \gamma) & \cdots & INSS_{nn}^{avg}(\alpha, \beta, \gamma) \end{matrix}$

• Compute the optimal choice $C_j = \max_{C_i \in C} \{C_i\}$.


Example 
A. Triple SARSA learning 
Input a sample $POMDP(J_i)^{rd} = \{J_1(80), J_2(60), J_3(40), J_4(34), J_5(55), J_6(41), J_7(62), J_8(63), J_9(99)\}$
and the $HMM(R_i)^{rd} = \{R_1(32), R_2(59), R_3(55), R_4(61), R_5(83), R_6(99), R_7(67), R_8(76), R_9(80)\}$.
• Construct 3 $INSS_i(\alpha, \beta, \gamma)$:
[0.5, 0.4, 0.1]   [0.5, 0.3, 0.2]   [0.1, 0.1, 0.8]
[0.4, 0.2, 0.4]   [0.3, 0.1, 0.6]   [0.1, 0.8, 0.1]
[0.5, 0.4, 0.1]   [0.3, 0.6, 0.1]   [0.3, 0.2, 0.5]
• Compute the threshold of $INSS_i(\alpha, \beta, \gamma)^* = \{[0.8, 0.2, 0.0], [0.4, 0.6, 0.0]\}$
• Compute the $INSS_i^{avg}(\alpha, \beta, \gamma) = \{AD_1 = 0.5, AD_2 = 0.8, AD_3 = 0.2\}$
• Summarize the computed $INSS_i^{avg}$:
0.8   0.5   0.2
0.1   0.3   0.4
0.5   0.1   0.9
• Output the optimal choice $C_j = 0.9$
B. Double SARSA learning 
Input a sample $POMDP(J_i)^{rd} = \{J_1(80), J_2(60), J_3(40), J_4(34), J_5(55), J_6(41), J_7(62), J_8(63), J_9(99)\}$
and the $HMM(R_i)^{rd} = \{R_1(32), R_2(59), R_3(55), R_4(61), R_5(83), R_6(99), R_7(67), R_8(76), R_9(80)\}$.
• Construct 3 $INSS_i(\alpha, \beta, \gamma)$:
[0.5, 0.3, 0.2]   [0.6, 0.3, 0.1]   [0.0, 0.1, 0.9]
[0.3, 0.4, 0.3]   [0.3, 0.4, 0.3]   [0.3, 0.3, 0.4]
[0.4, 0.2, 0.4]   [0.1, 0.5, 0.4]   [0.6, 0.0, 0.4]
• Compute the threshold of $INSS_i(\alpha, \beta, \gamma)^* = \{[0.4, 0.2, 0.4], [0.5, 0.3, 0.3]\}$
• Compute the $INSS_i^{avg}(\alpha, \beta, \gamma) = \{AD_1 = 0.1, AD_2 = 0.2, AD_3 = 0.6\}$
• Summarize the computed $INSS_i^{avg}$:
0.1   0.5   0.3
0.8   0.6   0.2
0.4   0.3   0.2
• Output the optimal choice $C_j = 0.52$
C. Single SARSA learning
Input a sample $POMDP(J_i)^{rd} = \{J_1(80), J_2(60), J_3(40), J_4(34), J_5(55), J_6(41), J_7(62), J_8(63), J_9(99)\}$
and the $HMM(R_i)^{rd} = \{R_1(32), R_2(59), R_3(55), R_4(61), R_5(83), R_6(99), R_7(67), R_8(76), R_9(80)\}$.
• Construct 3 $INSS_i(\alpha, \beta, \gamma)$:
[0.3, 0.2, 0.4]   [0.2, 0.2, 0.6]   [0.4, 0.2, 0.3]
[0.4, 0.3, 0.3]   [0.3, 0.4, 0.3]   [0.3, 0.5, 0.2]
[0.3, 0.4, 0.3]   [0.3, 0.4, 0.2]   [0.6, 0.2, 0.2]
• Compute the threshold of $INSS_i(\alpha, \beta, \gamma)^* = \{[0.3, 0.3, 0.4], [0.5, 0.4, 0.1]\}$
• Compute the $INSS_i^{avg}(\alpha, \beta, \gamma) = \{AD_1 = 0.1, AD_2 = 0.6, AD_3 = 0.3\}$
• Summarize the computed $INSS_i^{avg}$:
0.5   0.5   0.3
0.0   0.2   0.5
0.6   0.4   0.8
• Output the optimal choice $C_j = 0.2$
The $C_j$ of the proposed expected 3-SARSA is 0.9, the $C_j$ of the double SARSA is 0.52,
and the $C_j$ of the single SARSA is 0.2. Hence, the $C_j$ of the expected 3-SARSA is
higher than the $C_j$ of both the double and the single SARSA.
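
As a small illustration of the final selection step, the following Python sketch applies the rule $C_j = \max\{C_i\}$ to the summary matrix of Example A and then ranks the three agents by the optimal choices reported in Examples A to C. The function and variable names are hypothetical, and the sketch does not reproduce the thresholding and averaging steps, whose exact rules are not spelled out here.

```python
def optimal_choice(summary):
    """Return the largest entry of a summarized INSS matrix (C_j = max C_i)."""
    return max(value for row in summary for value in row)


# Summary matrix of Example A (triple SARSA learning).
summary_a = [
    [0.8, 0.5, 0.2],
    [0.1, 0.3, 0.4],
    [0.5, 0.1, 0.9],
]

# Optimal choices as reported in Examples A, B, and C.
reported = {
    "expected 3-SARSA": 0.9,
    "double SARSA": 0.52,
    "single SARSA": 0.2,
}

choice_a = optimal_choice(summary_a)
best_agent = max(reported, key=reported.get)
print(choice_a, best_agent)
```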


6. Results and discussion 


The performance of the expected 3-SARSA learning in the Proposed Work (PW) is
compared with that of the fuzzy SARSA learning in the Existing Work (EW) [24] with
respect to the throughput achieved and the rate of learning. The default parameters for the
SARSA Algorithm are determined by measuring the performance of the jobs running
on Virtual Machines (VMs) against the resources offered by the VMs. The system-wide
performance of the jobs running on VMs is evaluated using interactive benchmark
workloads under varying workload scenarios.


6.1. Experimental setup 


For experimentation, we used the Xen Hypervisor based paravirtualization
model, on which more than 100 VM instances were created. Each of the
benchmark workloads is deployed on clusters of VMs, which are enabled with
Hypertext Preprocessor (PHP) and MySQL accessibility services. To support
memory-intensive behavior, the connection timeout is set to 10 s, and to prevent
bottleneck situations, no memory consumption limit is enforced on the applications
running on the VMs [24].


6.2. Benchmark applications 


The typical workloads considered for performance evaluation are RUBiS, RUBBoS,
and Olio. RUBiS is a dynamic workload modeled after eBay.com; it includes an
emulator that creates client jobs of varying load. RUBBoS is modeled after the
Slashdot online news forum, which provides both regular and moderator levels of
access to clients. Olio is a social events calendar application used to support
Web 2.0 applications with networking functions such as commenting on posts, posting
reviews, sharing posts, and tagging friends in posts. To verify the efficiency of the
proposed work with respect to throughput and learning rate, two types of experiments
are carried out: one with a homogeneous workload and the other with a heterogeneous
workload.


6.3. Experiment-1: Homogeneous workload 


For Experiment-1 the following workloads are being considered, i.e., (RUBiS; 
21,000 browsing clients; time 50 s), and (RUBiS; 30,000 bidding clients, time 50 s). 
Table 1 shows the performance comparison of proposed work with existing work on 
homogeneous workload. 

A graph of the number of iterations versus throughput (number of requests
successfully completed per iteration) is shown in Fig. 5. For the proposed work, the
successful job completion rate on the RUBiS workload increases with the number of
iterations for both browsing and bidding clients, as the expected 3-SARSA Algorithm
learns quickly with a minimum exploration rate. For the existing work, the successful
job completion rate is moderate for the RUBiS workload with bidding clients and low
for the RUBiS workload with browsing clients, as the exploration rate of the Fuzzy
SARSA Algorithm is high.


Fig. 5. Number of iterations versus throughput (x-axis: number of iterations, 100-1000; y-axis: throughput)


A graph of time versus learning rate is shown in Fig. 6. The learning rate of the
proposed work is high, between 0.7 and 0.8, for the RUBiS workload with browsing
clients and moderate, between 0.5 and 0.6, for the RUBiS workload with bidding
clients, as the expected 3-SARSA Algorithm collects the maximum possible rewards.
The learning rate of the existing work, in contrast, is moderate, between 0.5 and 0.6,
for the RUBiS workload with bidding clients and lower, between 0.1 and 0.2, for the
RUBiS workload with browsing clients, because the Fuzzy SARSA Algorithm collects
the minimum possible rewards.

Fig. 6. Time versus learning rate


Table 1. Performance comparison of proposed work with existing work on homogeneous workload

Performance metric: Throughput (3000-9000 jobs) versus number of iterations (100-1000 iterations)

Works considered   Workload type      Fewer iterations   Moderate iterations   Higher iterations
for analysis                          (100-400)          (400-700)             (700-1000)
Proposed work      RUBiS: Browsing    5000-8000          8000-9000             8000-9000
                   RUBiS: Bidding     7000-8000          6000-8000             7000-8000
Existing work      RUBiS: Browsing    5000-6000          4000-6000             4000-5000
                   RUBiS: Bidding     6000-7000          6000-7000             6000-7000

Performance metric: Learning rate (0-1) versus time interval (100-1000 ms)

Works considered   Workload type      Lower time interval   Moderate time interval   Higher time interval
for analysis                          (100-400)             (400-700)                (700-1000)
Proposed work      RUBiS: Browsing    0.7-0.75              0.7-0.8                  0.7-0.8
                   RUBiS: Bidding     0.5-0.6               0.5-0.6                  0.5-0.56
Existing work      RUBiS: Browsing    0.5-0.6               0.5-0.55                 0.5-0.55
                   RUBiS: Bidding     0.2-0.22              0.1-0.2                  0.2-0.22


Table 1 compares the performance of the proposed work with that of the existing work
in terms of throughput and learning rate under the homogeneous workload. For the RUBiS
workload, the proposed work achieves very high throughput and a moderate learning
rate, whereas the existing work achieves moderate throughput but a weak learning rate.
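
To make the throughput comparison in Table 1 easier to read at a glance, the following Python sketch reduces each reported range to its midpoint and averages the three iteration bands. Using the midpoint as a summary statistic is an assumption made only for illustration; the table reports ranges, not point estimates, and the names used here are hypothetical.

```python
# Throughput ranges (jobs) from Table 1 for the three iteration bands
# (100-400, 400-700, 700-1000).
table1_throughput = {
    ("Proposed work", "RUBiS: Browsing"): [(5000, 8000), (8000, 9000), (8000, 9000)],
    ("Proposed work", "RUBiS: Bidding"): [(7000, 8000), (6000, 8000), (7000, 8000)],
    ("Existing work", "RUBiS: Browsing"): [(5000, 6000), (4000, 6000), (4000, 5000)],
    ("Existing work", "RUBiS: Bidding"): [(6000, 7000), (6000, 7000), (6000, 7000)],
}


def mean_midpoint(ranges):
    """Average the midpoints of (low, high) ranges (illustrative statistic)."""
    return sum((low + high) / 2 for low, high in ranges) / len(ranges)


midpoints = {key: mean_midpoint(r) for key, r in table1_throughput.items()}

for (work, workload), value in midpoints.items():
    print(f"{work:14s} {workload:16s} {value:8.1f}")
```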


6.4. Experiment-2: Heterogeneous workload


For Experiment 2 the following workloads are being considered, i.e., (RUBiS; 3,000 
browsing clients, 13,000 selling clients; time 50 s), (RUBBoS; 30,000 bidding 
clients, 12,000 concurrent clients; time 50 s), and (Olio; 30,000 concurrent clients; 
time 50 s). Table 2 shows the performance comparison of proposed work with 
existing work on heterogeneous workload. 

Table 2. Performance comparison of proposed work with existing work on heterogeneous workload

RUBiS. Performance metric: Throughput (3000-9000 jobs) versus number of iterations (100-1000 iterations)

Works considered   Workload type        Fewer iterations   Moderate iterations   Higher iterations
for analysis                            (100-400)          (400-700)             (700-1000)
Proposed work      RUBiS: Browsing      6000-9000          7000-9000             7000-8000
                   RUBiS: Selling       7000-8000          7000-7500             7200-7500
Existing work      RUBiS: Browsing      5000-6000          5000-6000             4000-5000
                   RUBiS: Selling       2000-3000          2000-5000             3000-4500

RUBiS. Performance metric: Learning rate (0-1) versus time interval (100-1000 ms)

Works considered   Workload type        Lower time interval   Moderate time interval   Higher time interval
for analysis                            (100-400)             (400-700)                (700-1000)
Proposed work      RUBiS: Browsing      0.7-0.75              0.7-0.8                  0.75-0.80
                   RUBiS: Selling       0.5-0.7               0.55-0.65                0.65-0.75
Existing work      RUBiS: Browsing      0.2-0.4               0.2-0.4                  0.2-0.4
                   RUBiS: Selling       0.2-0.3               0.1-0.3                  0.1-0.2

RUBBoS. Performance metric: Throughput (3000-9000 jobs) versus number of iterations (100-1000 iterations)

Works considered   Workload type        Fewer iterations   Moderate iterations   Higher iterations
for analysis                            (100-400)          (400-700)             (700-1000)
Proposed work      RUBBoS: Bidding      6000-9000          7000-9000             7000-7300
                   RUBBoS: Concurrent   4000-7000          4000-7000             6500-7000
Existing work      RUBBoS: Bidding      5000-8000          5000-7000             6000-7000
                   RUBBoS: Concurrent   3000-3500          3500-4000             3000-3200

RUBBoS. Performance metric: Learning rate (0-1) versus time interval (100-1000 ms)

Works considered   Workload type        Lower time interval   Moderate time interval   Higher time interval
for analysis                            (100-400)             (400-700)                (700-1000)
Proposed work      RUBBoS: Bidding      0.7-0.9               0.7-0.9                  0.7-0.9
                   RUBBoS: Concurrent   0.7-0.9               0.5-0.9                  0.7-0.72
Existing work      RUBBoS: Bidding      0.3-0.5               0.5-0.51                 0.3-0.4
                   RUBBoS: Concurrent   0.1-0.5               0.1-0.5                  0.3-0.5

Olio. Performance metric: Throughput (3000-9000 jobs) versus number of iterations (100-1000 iterations)

Works considered   Workload type        Fewer iterations   Moderate iterations   Higher iterations
for analysis                            (100-400)          (400-700)             (700-1000)
Proposed work      Olio: Concurrent     6000-7000          6000-7500             7000-8000
Existing work      Olio: Concurrent     2000-5000          2500-3500             3000-5000

Olio. Performance metric: Learning rate (0-1) versus time interval (100-1000 ms)

Works considered   Workload type        Lower time interval   Moderate time interval   Higher time interval
for analysis                            (100-400)             (400-700)                (700-1000)
Proposed work      Olio: Concurrent     0.1-0.6               0.6-0.62                 0.6-0.9
Existing work      Olio: Concurrent     0.1-0.6               0.2-0.6                  0.2-0.4

RUBiS workload


The performance of the proposed and existing work is evaluated with respect to
throughput and learning rate by considering browsing and selling clients of the RUBiS
workload.

A graph of the number of iterations versus throughput for the RUBiS workload
with browsing and selling clients is shown in Fig. 7. The successful job
completion rate of the proposed work is high for both browsing and selling clients,
as the dynamic nature of the RUBiS workload is handled smoothly by the NSS and FNSS
enabled 3-SARSA Algorithm, which is capable of handling different uncertainties in
the input parameters. The successful job completion rate of the existing work, in
contrast, is lower for selling clients and moderate for browsing clients, as the
dynamic nature of the RUBiS workload is not handled properly by the Fuzzy SARSA
Algorithm because of its use of a non-differentiable polygon membership function.

Fig. 7. Number of iterations versus throughput


A graph of time versus learning rate for the RUBiS workload with
browsing and selling clients is shown in Fig. 8. The learning rate of the proposed
work is high for browsing clients, falling in the range of 0.7 to 0.8, and moderate
for selling clients, i.e., between 0.5 and 0.8, owing to the approximate and easily
adaptable nature of the 3-SARSA Algorithm. The learning rate of the existing work,
however, is lower for both browsing and selling clients due to the individual-specific
nature of the polygon membership function used in the Fuzzy SARSA Algorithm.

Fig. 8. Time versus learning rate

RUBBoS workload

The performance of the proposed and existing work is evaluated with respect to
throughput and learning rate by considering bidding and concurrent clients of the
RUBBoS workload.

Fig. 9. Number of iterations versus throughput


A graph of the number of iterations versus throughput for the RUBBoS
workload with bidding and concurrent clients is shown in Fig. 9. The successful job
completion rate of the proposed work is high for bidding clients and remains
moderate for concurrent clients, as the 3-SARSA Algorithm easily handles the
stochastically unstable phenomena in the workload using NSS and FNSS theory.
The successful job completion rate of the existing work is high for bidding clients
and low for concurrent clients, as the Fuzzy SARSA Algorithm cannot easily handle
the stochastically unstable phenomena in the workload because of the tedious
procedure involved in calculating the fuzzy membership function.

A graph of time versus learning rate for the RUBBoS workload with
bidding and concurrent clients is shown in Fig. 10. The learning rate of the proposed
work remains between 0.7 and 0.9 for both bidding and concurrent clients, owing to
the exploratory learning policy of the 3-SARSA Algorithm. The learning rate of the
existing work, in contrast, is lower, between 0.1 and 0.5, for concurrent clients and
moderate for bidding clients, owing to the non-exploratory learning policy of the
Fuzzy SARSA Algorithm.

Olio workload 

The performance of the proposed and existing work is evaluated with respect to
throughput and learning rate by considering concurrent clients of the Olio workload.

A graph of the number of iterations versus throughput for the Olio
workload made up of concurrent clients is shown in Fig. 11. The successful job
completion rate of the proposed work is moderate, as the 3-SARSA Algorithm can
capture the maximum possible uncertainties in the incoming workload using NSS and
FNSS theory. The successful job completion rate of the existing work drops sharply,
however, as the Fuzzy SARSA Algorithm, which relies on a polygon fuzzy membership
function that is not continuously differentiable, fails to capture all possible
uncertainties in the incoming workload.

Fig. 11. Number of iterations versus throughput


Fig. 12. Time versus learning rate (x-axis: time, 100-1000 ms; y-axis: learning rate, 0-1; series: Olio-Concurrent)


A graph of time versus learning rate for the Olio workload with
concurrent clients is shown in Fig. 12. The learning rate of the proposed work
for concurrent clients is moderate, between 0.6 and 0.8, because of the
superior resource provisioning ability of the 3-SARSA Algorithm, which considers
three expected states while forming the resource provisioning policies. The
learning rate of the existing work with concurrent clients, in contrast, fluctuates
between 0.1 and 0.6 on a scale of 0 to 1, owing to the weaker resource provisioning
ability of the Fuzzy SARSA Algorithm, which does not consider adjacent states
while forming resource provisioning policies.

Table 2 compares the performance of the proposed work with that of the existing work
in terms of throughput and learning rate under the heterogeneous workload. For the
RUBiS workload, the proposed work achieves high throughput and a moderate learning
rate, whereas the existing work is weak in both throughput and learning rate. For the
RUBBoS workload, the proposed work is moderate in both throughput and learning rate,
and the existing work is likewise moderate in both. For the Olio workload, the proposed
work achieves high throughput and a moderate learning rate, whereas the existing work
is weak in throughput but moderate in learning rate.


7. Conclusion 


The paper presents a new NSS and FNSS based expected 3-SARSA learning
framework for resource provisioning in the cloud environment. The irrelevant
parameters or outliers of the jobs and resources are reduced, which improves the
quality of the resource provisioning decisions taken. The proposed agent compares the
current state with the two expected states to form an optimal resource provisioning
decision, which increases the number of rewards collected by the agent and stabilizes
the learning. Its performance is found to be good with respect to successful job
completion rate and learning rate. In future work, the expected 3-SARSA learning
framework will be improved to be self-adaptable and capable of performing both
resource scheduling and resource provisioning at runtime with minimum SLA violation
and minimum cost incurred.